In many ecological and environmental settings, measurements are taken from fixed sampling units, aiming to quantify spatial variation and interpolate values at unobserved sites.
For a set of geostatistical data \(\mathbf{z} = \{ z(\mathbf{s}_1), \ldots, z(\mathbf{s}_m) \}\), we can consider the general model:
\[Z(\mathbf{s}_i) = \mu(\mathbf{s}_i) + e(\mathbf{s}_i)\]
\(\mu(\mathbf{s}_i)\) is a mean function which models trend and covariate effects.
Then \(e(\mathbf{s}_i)\) is the error process, which accounts for any spatial correlation that remains after accounting for \(\mu(\mathbf{s}_i)\).
Spatial statistics is therefore often focused on understanding the process for \(e(\mathbf{s}_i)\).
The first step is to assess whether there is any evidence of spatial dependency in our data.
Spatial dependence in georeferenced data can be explored by a function known as a variogram \(2\gamma(\cdot)\) (or semivariogram \(\gamma(\cdot)\)).
The variogram is similar in many ways to the autocorrelation function used in time series modelling.
In simple terms, it is a function which measures the difference in the spatial process between a pair of locations a fixed distance apart.
The variogram measures the variance of the difference in the process \(Z(\cdot)\) at two spatial locations \(\mathbf{s}\) and \(\mathbf{s+h}\) and is defined as:
\[\mathrm{Var}[Z(\mathbf{s}) - Z(\mathbf{s} + \mathbf{h})] = E[(Z(\mathbf{s}) - Z(\mathbf{s} + \mathbf{h}))^2] = 2\gamma_z(\mathbf{h}).\]
Note that in practice we use the semi-variogram \(\gamma_z(\mathbf{h})\) because our points come in pairs, and the semi-variance is equivalent to the variance per point at a given lag.
When the variance of the difference \(Z(\mathbf{s}) - Z(\mathbf{s} + \mathbf{h})\) is relatively small, then \(Z(\mathbf{s})\) and \(Z(\mathbf{s} + \mathbf{h})\) are similar (spatially correlated).
When the variance of the difference \(Z(\mathbf{s}) - Z(\mathbf{s} + \mathbf{h})\) is relatively large, then \(Z(\mathbf{s})\) and \(Z(\mathbf{s} + \mathbf{h})\) are less similar (closer to independence).
A plot of the empirical semivariogram against the separation distance conveys important information about the continuity and spatial variability of the process.
In practice, we only have access to \(m\) realisations of this process, and therefore we have to estimate the variogram. This is known as the empirical variogram.
The empirical semivariogram can be used as an exploratory tool to assess whether the data present spatial correlation.
We obtain this by computing the semi-variance for all possible pairs of observations: \(\gamma(\mathbf{s}, \mathbf{s} + \mathbf{h}) = 0.5(Z(\mathbf{s}) - Z(\mathbf{s} + \mathbf{h}))^2\).
The parana data from the geoR package contain the average rainfall over different years for the period May to June at 143 monitoring stations in Paraná state, Brazil.
Rainfall values measured at 143 recording stations in Paraná state, Brazil with low values being represented in blue and high values in red.
To illustrate how an empirical variogram is computed, consider the two highlighted locations below.
We can first compute the distance between the two locations using the Euclidean distance formula \[h = \sqrt{(475.1 - 403)^2 + (83.6 - 164.5)^2} = 108.37\]
Next, we compute the semi-variance between the points using their observed values as \[\begin{aligned}\gamma(\mathbf{s}, \mathbf{s}+\mathbf{h}) &= 0.5(Z(\mathbf{s}) - Z(\mathbf{s}+\mathbf{h}))^2 \\ &= 0.5(315.33 - 306.9)^2 = 35.53\end{aligned}\]
We repeat this process for every possible pair of points, and plot \(h\) against \(\gamma(\mathbf{s}, \mathbf{t})\) for each.
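The two calculations above can be sketched in base R; the coordinates and values are the two example stations from the slide, and `combn`/`dist` generalise the same computation to every pair:

```r
# Semi-variance for each pair: gamma = 0.5 * (z_i - z_j)^2, plotted against
# the Euclidean distance h between the pair (values from the example above)
coords <- rbind(c(475.1, 83.6), c(403, 164.5))
z      <- c(315.33, 306.9)

pairs <- combn(length(z), 2)                      # all index pairs (i, j)
h     <- as.vector(dist(coords))                  # Euclidean distance per pair
gamma <- 0.5 * (z[pairs[1, ]] - z[pairs[2, ]])^2  # semi-variance per pair

round(h, 2)      # 108.37
round(gamma, 2)  # 35.53
```

With the full data set, `coords` and `z` simply hold all 143 stations and the same three lines produce the whole variogram cloud.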
We can calculate the empirical variogram for the data using the variogram function from the gstat library.
This plot shows the semi-variances for each pair of points.
Each pair of points has a different distance, making it difficult to use this for prediction.
To make the variogram easier to use and interpret, we divide the distances into a set of discrete bins, and compute the average semi-variance in each.
The bins are illustrated on the left, and the empirical variogram obtained from them is shown on the right.
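The binning step can be sketched in base R on synthetic data (all names and bin choices here are illustrative; gstat's variogram function does this for you):

```r
# Synthetic illustration of binning: average the pairwise semi-variances
# within equally spaced distance bins
set.seed(1)
n      <- 100
coords <- cbind(runif(n, 0, 100), runif(n, 0, 100))
z      <- rnorm(n)

pairs <- combn(n, 2)
h     <- as.vector(dist(coords))
gamma <- 0.5 * (z[pairs[1, ]] - z[pairs[2, ]])^2

bins      <- cut(h, breaks = seq(0, max(h), length.out = 16))
mids      <- tapply(h, bins, mean)       # average distance per bin
emp_vario <- tapply(gamma, bins, mean)   # binned empirical semi-variogram
# plot(mids, emp_vario) gives the binned empirical variogram
```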
We can construct a null envelope based on permutations of the data values across the locations, i.e. an envelope built under the assumption of no spatial correlation.
By overlaying this envelope on the empirical variogram we can assess whether there is spatial dependence in our data.
We can construct permutation envelopes on the empirical variogram using the envelope function from the variosig R package.
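The idea behind these envelopes can be sketched in base R: repeatedly shuffle the values across the fixed locations (destroying any spatial structure), recompute the binned semi-variances, and take pointwise quantiles. This is a toy illustration on synthetic data; variosig's envelope function implements the real thing:

```r
# Permutation envelope sketch: shuffling values across locations mimics
# "no spatial correlation"; quantiles of the shuffled variograms form the envelope
set.seed(1)
n      <- 80
coords <- cbind(runif(n, 0, 100), runif(n, 0, 100))
z      <- rnorm(n)

pairs <- combn(n, 2)
h     <- as.vector(dist(coords))
bins  <- cut(h, breaks = seq(0, max(h), length.out = 11))

binned_vario <- function(values) {
  g <- 0.5 * (values[pairs[1, ]] - values[pairs[2, ]])^2
  tapply(g, bins, mean)
}

perms    <- replicate(199, binned_vario(sample(z)))  # null variograms
envelope <- apply(perms, 1, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
observed <- binned_vario(z)
# bins where `observed` leaves the envelope suggest spatial dependence
outside  <- observed < envelope[1, ] | observed > envelope[2, ]
```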
In this example, we observe that the variogram falls outside of the null envelope at distances below \(200\) and again at distances above \(300\), indicating spatial dependence.
Once we have computed an empirical variogram, we have to think about model fitting.
[1] "There are 13 out of 15 variogram estimates outside the 95% envelope."
We treat the observed process of interest as being measured with error
\[ (\text{observed value})_i = (\text{true value at location } i) + (\text{error})_i \]
alternatively
\[ y_i = Z(\mathbf{s}_i) + \varepsilon_i \]
When geostatistical data are considered, we can often assume that there is a spatially continuous variable underlying the observations that can be modeled using a random field.
we have a process that is occurring everywhere in space \(\rightarrow\) natural to try to model it using some sort of function (of space)
a random field is a random function that generates smooth surfaces
This is hard
We typically make our lives easier by making everything Gaussian
A Gaussian random field (GRF) is a collection of random variables, where observations occur in a continuous domain, and where every finite collection of random variables has a multivariate normal distribution
\[ \mathbf{z} = (z(\mathbf{s}_1),\ldots,z(\mathbf{s}_m)) \sim N(\boldsymbol{\mu},\Sigma), \qquad \boldsymbol{\mu} = (\mu(\mathbf{s}_1),\ldots,\mu(\mathbf{s}_m)), \]
where \(\Sigma_{ij} = \mathrm{Cov}(z(\mathbf{s}_i),z(\mathbf{s}_j))\) is a dense \(m \times m\) matrix.
Our Gaussian process is said to be weakly stationary if the following criteria are met:
\(E[{Z(\mathbf{s})}] = \mu_z(\mathbf{s}) = \mu_z\) - a finite constant which does not depend on \(\mathbf{s}\).
\(\text{Cov}(Z(\mathbf{s}_i),Z(\mathbf{s}_j)) = C_z(\mathbf{s}_i-\mathbf{s}_j)\) - a finite quantity which depends only on the separation \((\mathbf{s}_i-\mathbf{s}_j)\).
Condition 1 states that our mean function must be constant in space, with no overall spatial trend.
Condition 2 states that for any two locations, their covariance depends only on how far apart they are (their spatial lag, \(h\)), not their absolute position.
Our process is said to be isotropic if the covariance function is directionally invariant. This means that the covariance depends only on the Euclidean distance \((||\mathbf{s}_i-\mathbf{s}_j||)\) and not the direction.
The first step in defining a model for a random field in a hierarchical framework is to identify a probability distribution for the observations available at \(m\) sampled spatial locations and represented by the vector \(\mathbf{y} = y_1,\ldots,y_m\).
For example, if we assume our observations follow a Gaussian distribution then
\[ \begin{aligned} Y_i &\sim N(\mu_i,\tau_e^{-1})\\ \eta_i &=\mu_i = \beta_0 + \ldots + Z(\mathbf{s}_i) \end{aligned} \]
\(\tau_e^{-1} = \sigma^2_e\) represents the variance of the zero-mean measurement error (equivalent to the nugget effect)
The response mean \(\mu_i\) which coincides with the linear predictor \(\eta_i\) is defined based on:
the intercept \(\beta_0\) and any additional covariates
the realization of the latent (unobservable) GF \(Z(\mathbf{s}) \sim \mathrm{MVN}(0,\Sigma)\) which accounts for the spatial correlation through \(\Sigma = C_z(\cdot,\cdot)\).
A commonly used covariance function is the Matérn covariance function. The covariance of two points which are a distance \(h\) apart is:
\[ \Sigma =C_{\nu}(h) = \sigma^2 \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu} h}{\rho} \right)^{\nu} K_{\nu} \left( \frac{\sqrt{2\nu} h}{\rho} \right) \]
\(\Gamma(\cdot)\) is the gamma function
\(K_{\nu}(\cdot)\) is the modified Bessel function of the second kind.
Parameters \(\sigma^2\), \(\rho\) and \(\nu\) are non-negative values of the covariance function.
\(\sigma^2\) is the spatially structured variance component
\(\rho\) is the range of the spatial process
\(\nu\) controls smoothness of the spatial process.
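The Matérn covariance above can be written directly in base R (besselK and gamma are built in). As a sanity check, \(\nu = 1/2\) in this parameterisation reduces it to the exponential covariance \(\sigma^2 e^{-h/\rho}\):

```r
# Matérn covariance C_nu(h) in the (sigma^2, rho, nu) parameterisation above
matern_cov <- function(h, sigma2 = 1, rho = 1, nu = 0.5) {
  k   <- sqrt(2 * nu) * h / rho
  out <- sigma2 * (2^(1 - nu) / gamma(nu)) * k^nu * besselK(k, nu)
  out[h == 0] <- sigma2  # limit as h -> 0
  out
}

h <- seq(0.1, 3, by = 0.1)
# nu = 1/2 matches the exponential covariance up to floating-point error
max(abs(matern_cov(h, nu = 0.5) - exp(-h)))  # effectively zero
```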
The disadvantage of the modelling approach involving the spatial covariance function is known as the "big \(n\) problem" and concerns the computational costs required for algebra operations with dense covariance matrices (such as \(\Sigma\)).
In particular dense matrix operations scale cubically with the matrix size, given by the number of locations where the process is observed. A computationally effective alternative is given by the stochastic partial differential equation (SPDE) approach
We define a (Matérn) GRF as the solution of a stochastic partial differential equation (SPDE)
\[ (\kappa^2-\Delta)^{\alpha/2}Z(t) = W(t) \]
What is this?
Imagine a guitar string stretched from left to right.
Now imagine someone randomly taps along it at many locations:
This is pure randomness. But a real string does not behave like this
Adding a restoring term \(\kappa^2 Z(t)\) pulls the string back towards zero; \(\kappa\) controls how quickly the correlation decays with distance (the range):
\[(\color{red}{\kappa^2}-\Delta)Z(t) = W(t), ~~~\text{for } \alpha=2\]
Raising the operator to a fractional power \(\alpha/2\) controls the smoothness of the solution:
\[ (\kappa^2-\Delta)^{\color{red}{\alpha}/2}Z(t) = W(t) \]
Ok…but we still need to solve the SPDE to find \(Z(t)\)!
Now we need to discretise the domain into \(T\) points (we cannot compute on a continuum!)
We represent our solution as
\[ Z(t) = \sum_{i = 1}^T\psi_i(t)w_i \]
where
\(\psi_i(t)\) are (known) basis functions for nodes \(i=1,\ldots,T\)
\(w_i\) are (unknown) weights
This solution is then approximated using a finite combination of piece-wise linear basis functions.
The solution is completely defined by a Gaussian vector of weights with zero mean and a sparse precision matrix.
Now we approximate the GRF using a triangulated mesh.
The SPDE approach represents the continuously indexed spatial process by a discretely indexed Gaussian Markov Random Field (GMRF).
We construct an appropriate lower-resolution approximation of the surface by sampling it in a set of well designed points and constructing a piece-wise linear interpolant.
Note that \(\nu = \alpha - d/2\); for \(\alpha=2\) and \(d=2\) we have \(\nu = 1\), so that:
Note
\[ \begin{aligned} Z(s) &= \sum_{i = 1}^K\psi_i(s)w_i \\ \mathbf{w} &\sim N(\mathbf{0},Q^{-1}) \leftarrow \text{GMRF}\\ Q &= \tau^2(\kappa^4 \mathbf{C} + 2\kappa^2 \mathbf{G}+\mathbf{G}\mathbf{C}^{-1}\mathbf{G}) \end{aligned} \]
\(\mathbf{C}\) is diagonal with entries \(C_{ii} =\int \psi_i(s)\mathrm{d}s\) and measures how much of the domain each basis function covers.
\(G_{ij} = \int \nabla \psi_i(s) \nabla \psi_j(s) \mathrm{d}s\) reflects the connectivity of the mesh nodes.
Because each basis function overlaps only with nearby ones, the resulting precision matrix is sparse, meaning each coefficient depends directly only on its neighbours.
\[ Z(s) = \sum_{i = 1}^K\psi_i(s)w_i \]
The weights vector \(\mathbf{w} = (w_1,\dots,w_K)\) is Gaussian with a sparse precision matrix \(\longrightarrow\) Computational convenience
The field has two interpretable parameters: the marginal standard deviation \(\sigma\) and the spatial range \(\rho\)
These parameters are linked to the parameters of the SPDE (\(\tau\) and \(\kappa\))
We need to assign priors to them
Penalized Complexity (PC) priors proposed by Simpson et al. (2017) allow us to control the amount of spatial smoothing and avoid overfitting.
F. Lindgren, H. Rue, and J. Lindström. An explicit link between Gaussian fields and Gaussian Markov random fields: The SPDE approach (with discussion). In: Journal of the Royal Statistical Society, Series B 73.4 (2011), pp. 423–498.
H. Bakka, H. Rue, G. A. Fuglstad, A. Riebler, D. Bolin, J. Illian, E. Krainski, D. Simpson, and F. Lindgren. Spatial modelling with R-INLA: A review. In: WIREs Computational Statistics 10:e1443.6 (2018). (Invited extended review). DOI: 10.1002/wics.1443.
E. T. Krainski, V. Gómez-Rubio, H. Bakka, A. Lenzi, D. Castro-Camilo, D. Simpson, F. Lindgren, and H. Rue. Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA. CRC Press, 2018.
D. Simpson, H. Rue, A. Riebler, T. G. Martins, and S. H. Sørbye. Penalising model component complexity: A principled, practical approach to constructing priors. In: Statistical Science 32.1 (2017), pp. 1–28.
Let's revisit the parana data containing the average rainfall over different years for the period May to June at 143 monitoring stations in Paraná state, Brazil.
Rainfall values measured at 143 recording stations in Paraná state, Brazil with low values being represented in blue and high values in red.
Stage 1 Model for the response \[ y(s)|\eta(s)\sim\text{Normal}(\mu(s),\sigma^2_e) \]
Stage 2 Latent field model \[ \eta(s) = \mu(s) = \beta_0 + Z(s) \]
Stage 3 Hyperparameters
First, we need to create the mesh used to approximate the random field.
max.edge for maximum triangle edge lengths
offset for inner and outer extensions (to prevent edge effects)
cutoff to avoid overly small triangles in clustered areas
All random field models need to be discretised for practical calculations.
The SPDE models were developed to provide a consistent model definition across a range of discretisations.
We use finite element methods with local, piecewise linear basis functions defined on a triangulation of a region of space containing the domain of interest.
Deviation from stationarity is generated near the boundary of the region.
The choice of region and choice of triangulation affects the numerical accuracy.
If the mesh is too fine \(\rightarrow\) heavy computation
If the mesh is too coarse \(\rightarrow\) not accurate enough
Some guidelines
Create triangulation meshes with fm_mesh_2d():
edge length should be around a third to a tenth of the spatial range
Move undesired boundary effects away from the domain of interest by extending to a smooth external boundary:
Use a coarser resolution in the extension to reduce computational cost (max.edge=c(inner, outer)), i.e., add extra, larger triangles around the border
Use a fine resolution (subject to available computational resources) for the domain of interest (inner correlation range) and avoid small edges, i.e., filter out small input point clusters (0 \(<\) cutoff \(<\) inner)
Coastlines and similar can be added to the domain specification in fm_mesh_2d() through the boundary argument.
simplify the border
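Putting these guidelines together, the mesh call might look like the following sketch (fmesher is assumed to be loaded; `coords` is a placeholder for your observation locations, and the numeric values are illustrative and should be tuned to the spatial range of your data):

```r
library(fmesher)

# `coords`: an n x 2 matrix of observation locations (illustrative name)
mesh <- fm_mesh_2d(
  loc      = coords,
  max.edge = c(30, 90),   # fine triangles inside, coarser in the extension
  offset   = c(50, 150),  # inner and outer extension widths
  cutoff   = 10           # merge input points closer than this
)
plot(mesh)
```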
We use the inla.spde2.pcmatern function to define the SPDE model, with PC priors specified through the following probability statements:
\(P(\rho < 100) = 0.5\)
\(P(\sigma > 1) = 0.5\)
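These two statements translate directly into the prior.range and prior.sigma arguments (a sketch; INLA and the mesh object built earlier are assumed):

```r
library(INLA)

# PC priors: P(rho < 100) = 0.5 and P(sigma > 1) = 0.5
spde <- inla.spde2.pcmatern(
  mesh        = mesh,
  prior.range = c(100, 0.5),  # c(rho_0, p) with P(rho < rho_0) = p
  prior.sigma = c(1, 0.5)     # c(sigma_0, p) with P(sigma > sigma_0) = p
)
```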
The Model
\[ \begin{aligned} y(s)|\eta(s)&\sim\text{Normal}(\mu(s),\sigma^2_e)\\ \eta(s) & = \color{red}{\boxed{\beta_0}} + \color{red}{\boxed{ Z(s)}}\\ \end{aligned} \]
Simple feature collection with 143 features and 1 field
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 150.122 ymin: 70.36 xmax: 768.5087 ymax: 461.9681
CRS: NA
First 3 features:
value geometry
1 306.09 POINT (402.9529 164.5284)
2 200.88 POINT (501.7049 428.771)
3 167.07 POINT (556.3262 445.2706)
The code
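The fit itself might look like the following inlabru-style sketch (assumptions: `df` is the sf object shown above with rainfall in its value column, and `spde` is the PC-prior Matérn model; the component names are illustrative):

```r
library(inlabru)

# Intercept plus latent Matérn field evaluated at the point geometries
cmp <- value ~ Intercept(1) + field(geometry, model = spde)

fit <- bru(cmp, data = df, family = "gaussian")
summary(fit)  # posterior summaries for the intercept and hyperparameters
```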
| | Mean | 2.5% | 97.5% |
|---|---|---|---|
| Intercept | 249.714 | 232.748 | 264.983 |
| Precision for the Gaussian observations | 4.482 | 3.197 | 5.511 |
| Range for space | 57.328 | 46.234 | 70.330 |
| Stdev for space | 46.654 | 41.222 | 52.736 |
Intercept represents the average rainfall values
Precision for the Gaussian observations is the precision of the observational error
Range for space is the range of the Matérn field
Stdev for space is the marginal std deviation of the Matérn field
In geostatistical applications, the main interest resides in the spatial prediction of the latent field or of the response variable at new locations.
Suppose we observe a spatial process \(\{Z(s): s \in \mathcal{D}\}\) at locations \(s_1,\dots,s_n\).
Our goal: predict the variable of interest at an unobserved location \(s_0 \in \mathcal{D}\).
given the data \(y = (y_1,\dots,y_n)\), what can we say about \(Z(s_0)\)?
Rather than a single guess, we want a full uncertainty-aware prediction.
In a Bayesian setting, prediction is a probabilistic task.
The key lies in the posterior predictive distribution
\[ \pi(\tilde{Y} \mid y) = \int \pi(\tilde{Y} \mid \Theta, y)\, \pi(\Theta \mid y)\, d\Theta, \] where \(\Theta\) denotes all latent components and hyperparameters.
Step 0 — Augment the data with prediction locations
Introduce locations where predictions are desired.
The response is set to NA at these locations.
Covariates and spatial coordinates are still provided.
Step 1 — Build the projector matrix A
A projector matrix \(A\) linking the latent Gaussian field to the prediction locations.
This ensures that the spatial effects and covariates at new locations are properly included in the model.
Linear predictors are computed at these new locations.
Step 2 — Joint posterior inference
INLA computes the posterior of:
the latent Gaussian field,
fixed effects,
hyperparameters.
Step 3 — Posterior evaluation at prediction locations
Predictions come from the posterior marginals of the latent field and linear predictor.
Outputs include posterior means, variances, and credible intervals.
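With classic INLA these steps are typically carried out by stacking the observed data and the NA-response prediction rows together. A sketch, under the assumption that `mesh`, `spde`, observed coordinates `obs_loc`, responses `y`, and prediction coordinates `pred_loc` already exist (all illustrative names):

```r
library(INLA)

# Step 1: projector matrices for observed and prediction locations
A_obs  <- inla.spde.make.A(mesh, loc = obs_loc)
A_pred <- inla.spde.make.A(mesh, loc = pred_loc)

stack <- inla.stack(
  inla.stack(data = list(y = y), A = list(A_obs, 1), tag = "obs",
             effects = list(s = 1:spde$n.spde,
                            intercept = rep(1, length(y)))),
  inla.stack(data = list(y = rep(NA, nrow(pred_loc))),  # Step 0: NA responses
             A = list(A_pred, 1), tag = "pred",
             effects = list(s = 1:spde$n.spde,
                            intercept = rep(1, nrow(pred_loc))))
)

# Steps 2-3: joint inference; fitted values at the NA rows are the predictions
fit <- inla(y ~ 0 + intercept + f(s, model = spde),
            data = inla.stack.data(stack),
            control.predictor = list(A = inla.stack.A(stack), compute = TRUE))

idx <- inla.stack.index(stack, "pred")$data
fit$summary.fitted.values[idx, c("mean", "sd")]
```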
In the next example, we will explore data on the Pacific Cod (Gadus macrocephalus) from a trawl survey in Queen Charlotte Sound.
Envelope Variogram considering only where biomass \(>\) 0
[1] "There are 2 out of 15 variogram estimates outside the 95% envelope."
We thus have a dilemma: biomass is a positive continuous quantity, yet many locations record exact zeros, so a single Gaussian model is not appropriate.
Stage 1 Model for the response \[ y(s)|\eta(s)\sim \text{Tweedie}(p,\mu(s),\phi) \]
The Tweedie distribution accounts for positive continuous density values that also contain zeros
\(p\) determines the shape of the variance function (shifts from a Poisson distribution at \(p=1\) to a gamma distribution at \(p=2\))
\(\mu(s) = \exp(\eta(s))\) is the mean, linked to the linear predictor by the log link
\(\phi\) is the dispersion parameter
Stage 2 Latent field model \[ \eta(s) = \text{log}(\mu(s)) = \beta_0 + \beta_1 \text{depth} + \beta_2 \text{depth}^2 \]
Stage 3 Hyperparameters
INLA Model Results: posterior summaries of fixed effects and hyperparameters

| Parameter | Mean | 2.5% Quantile | 97.5% Quantile |
|---|---|---|---|
| Intercept | 3.866 | 3.690 | 4.043 |
| depth | −1.269 | −1.480 | −1.059 |
| depth2 | −1.077 | −1.256 | −0.898 |
| p parameter for Tweedie | 1.643 | 1.617 | 1.668 |
| Dispersion parameter for Tweedie | 3.804 | 3.530 | 4.093 |
Stage 1 Model for the response(s) \[ \begin{aligned} y_i|\eta^{(1)}_i&\sim \text{Binomial}(1,\pi_i)\\ \log(z_i)|\eta^{(2)}_i&\sim \text{Normal}(\mu_i,\tau_e^{-1})\\ \end{aligned} \]
We then define a likelihood for each outcome.
\(y_i =\begin{cases} 1 &\text{if fishes have been caught at location } \mathbf{s}_i \\ 0 &\text{otherwise}\end{cases}\)
\(z_i =\begin{cases} NA &\text{if no fish were caught at location } \mathbf{s}_i \\ \text{biomass density at location } \mathbf{s}_i &\text{otherwise}\end{cases}\)
Stage 2 Latent field model \[ \begin{aligned} \eta^{(1)}_i &= \text{logit}(\pi_i) = X'\beta + \xi_i\\ \eta^{(2)}_i &= \mu_i = X'\alpha + \omega_i \end{aligned} \]
Stage 3 Hyperparameters
| Parameter | Mean | 2.5% Quantile | 97.5% Quantile |
|---|---|---|---|
| \(\alpha_0\) | 3.397 | 3.074 | 3.721 |
| \(\alpha_1\) | −0.321 | −0.737 | 0.094 |
| \(\alpha_2\) | −0.238 | −0.613 | 0.138 |
| \(\beta_0\) | 0.531 | 0.170 | 0.892 |
| \(\beta_1\) | −1.163 | −1.574 | −0.751 |
| \(\beta_2\) | −0.829 | −1.133 | −0.525 |
| \(\tau_\epsilon^2\) | 0.542 | 0.404 | 0.706 |
| \(\rho^{[1]}\) | 122.738 | 10.845 | 542.705 |
| \(\tau_{d,1}\) | 2.126 | 0.174 | 8.907 |
| \(\rho^{[2]}\) | 125.147 | 11.047 | 557.310 |
| \(\tau_{d,2}\) | 2.140 | 0.177 | 8.980 |
Note that in the hurdle model there is no direct link between the parameters of the two observation parts.
the two likelihoods could share some of the components; for example, the Matérn field could be used for both predictors.
What do the previous results suggest about the estimated covariance parameters for the two fields? Is it sensible to share the same component between the two parts?
We will fit a model that estimates this field jointly and compare it with our two previous models
The model being fitted is now:
\[ \begin{aligned} y_i|\eta^{(1)}_i&\sim \text{Binomial}(1,\pi_i)\\ \eta^{(1)}_i &= \text{logit}(\pi_i) = X'\beta + \color{red}{\xi_i}\\ \log(z_i)|\eta^{(2)}_i&\sim \text{Normal}(\mu_i,\tau_e^{-1})\\ \eta^{(2)}_i &= \mu_i = X'\alpha + \color{red}{\xi_i} \end{aligned} \]
| Model | DIC | WAIC | MLIK |
|---|---|---|---|
| Tweedie | 1,599.690 | 1,612.043 | −979.532 |
| Hurdle | 1,286.775 | 1,286.528 | −682.896 |
| Hurdle 2 | 1,286.642 | 1,286.400 | −681.787 |
We need to compute:
\(\pi(s)\) = Catching probability
\(\mathbb{E}[Z(s)\,|\,Y(s)=1] = \exp\left(\mu(s) + \dfrac{1}{2\tau_{e}}\right)\) — the lognormal back-transformation of the positive part
\(\mathbb{E}[Z(s)] =\pi(s)\times \mathbb{E}[Z(s)\,|\,Y(s)=1]\)
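These quantities combine in base R as follows. The \(\exp(\mu + 1/(2\tau_e))\) term is the usual lognormal mean correction; the numeric inputs below are illustrative values, not output from the fitted model:

```r
# Expected biomass combining the two hurdle components:
# E[Z(s)] = pi(s) * exp(mu(s) + 1 / (2 * tau_e))
expected_biomass <- function(pi_s, mu_s, tau_e) {
  pi_s * exp(mu_s + 1 / (2 * tau_e))
}

# illustrative posterior-mean values
expected_biomass(pi_s = 0.5, mu_s = 1, tau_e = 2)  # 0.5 * exp(1.25)
```

In practice `pi_s`, `mu_s`, and `tau_e` would be vectors of posterior summaries (or posterior samples) at the prediction locations, giving a map of expected biomass.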